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Abstract  —  Data  association,  or  determining  corre¬ 
spondence  between  targets  and  measurements,  is  a  very 
difficult  problem  that  is  of  great  practical  importance. 
In  this  paper  we  formulate  the  classical  multi-target 
data  association  problem  as  a  graphical  model  and 
demonstrate  the  remarkable  performance  that  approx¬ 
imate  inference  methods,  specifically  loopy  belief  propa¬ 
gation,  can  provide.  We  apply  it  to  calculating  marginal 
association  weights  (e.g.,  for  JPDA )  for  single  scan  and 
multiple  scan  problems,  and  to  calculating  a  MAP  hy¬ 
pothesis  (i.e.,  multi- dimensional  assignment).  Through 
computational  experiments  involving  challenging  prob¬ 
lems,  we  demonstrate  the  remarkable  performance  of 
this  very  simple,  polynomial  time  algorithm;  e.g.,  errors 
of  less  than  0. 026  in  marginal  association  weights  and 
finding  the  optimal  5D  assignment  99. f%  of  the  time 
for  a  problem  with  realistic  parameters.  Impressively, 
the  formulation  commits  smaller  errors  in  association 
weights  in  challenging  environments,  i.e.,  in  problems 
with  low  Pd  and/or  high  false  alarm  rates.  Our  for¬ 
mulation  paves  the  way  for  the  expanding  literature  on 
approximate  inference  methods  in  graphical  models  to 
be  applied  to  classical  data  association  problems. 
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1  Introduction 

In  recent  years,  graphical  models  have  emerged  as  a 
powerful  tool  for  inference  and  learning  in  large  scale 
systems.  The  promise  of  graphical  models  in  tracking 
problems  was  demonstrated  in  [1,  2,  3].  The  formula¬ 
tion  in  the  former  focussed  on  sensor  networks,  in  which 
each  sensor  had  a  narrow  field  of  view.  Non-overlapping 
regions  were  defined  and  association  variables  were  in¬ 
stantiated  to  hypothesise  joint  association  events  for 
all  targets  and  measurements  within  a  region.  In  the 
present  study,  we  consider  the  classical  data  association 
problem,  in  which  a  single  sensor  surveils  a  large  num¬ 
ber  of  targets.  Each  target  may  give  rise  to  at  most 
one  measurement,  and  each  measurement  is  related  to 
at  most  one  target.  In  a  sense,  the  resulting  method 


decomposes  the  formulation  of  [2],  such  that  each  mea¬ 
surement  is  within  its  own  region. 

We  focus  on  approximate  solutions  to  two  core  prob¬ 
lems  in  data  association:  firstly,  calculating  marginal 
association  probabilities  such  as  those  used  in  Joint 
Probabilistic  Data  Association  (JPDA)  [4],  and  sec¬ 
ondly,  finding  a  maximum  a  posteriori  (MAP)  associ¬ 
ation  configuration,  similar  to  that  sought  in  [5].  Af¬ 
ter  introducing  our  notation  and  the  tools  of  graphi¬ 
cal  models  in  Section  2,  we  provide  (in  Section  3)  two 
graphical  model  formulations  of  the  single  scan,  single 
sensor  data  association  problem,  and  demonstrate  the 
remarkable  performance  that  is  achieved  in  Section  4.1. 
Our  method  may  be  easily  extended  to  problems  involv¬ 
ing  multiple  sensors  and  multiple  time  steps;  we  sketch 
this  development  in  Section  3.3,  and  demonstrate  re¬ 
sults  in  Section  4.2. 

2  Background 

2.1  Data  association  model 

We  now  describe  this  classical  model,  introducing  our 
notation  as  we  proceed.  We  denote  by  nt  the  number  of 
targets  under  track.  While  this  is  assumed  fixed  (and 
known),  varying  (and  uncertain)  numbers  of  targets  can 
be  accommodated  easily  through  a  framework  such  as 
[6].  We  denote  by  x\  G  R",  i  €  {1, . . .  ,nt},  the  state 
of  the  i-th  target  at  time  t,  and  by  Xt  =  (a:)) , . . . ,  a:™*) 
the  augmented  state  of  all  targets  at  time  t. 

Our  measurement  model  hypothesises  a  set  of  mea¬ 
surements  comprised  of  possible  target  detections  and 
false  alarms.  We  denote  by  mt  the  number  of  measure¬ 
ments  at  time  t,  and  by  zJt  G  Rm,  j  G  {1, . . . ,  mt},  the 
value  of  the  j-th  measurement.  A  target  in  state  x\ 
will  be  detected  with  probability  of  detection  Pd{x\). 
The  measurement  resulting  from  this  detection  is  dis¬ 
tributed  according  to  the  PDF  p(z\x\)  and  is  indepen¬ 
dent  of  all  other  measurements  and  targets. 

False  alarms  occur  according  to  a  non-homogeneous 
Poisson  point  process  with  intensity  Af a(z).  False  alarm 
measurements  are  independent  of  targets  and  their 
measurements.  The  complete  set  of  measurements  at 
time  t  is  denoted  as  Zt  =  (mt,  z\, . . . ,  z™*).  The  or¬ 
dering  of  measurements  within  this  vector  is  arbitrary. 
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The  complete  set  of  measurements  up  to  and  including 
time  t  is  denoted  as  Z 4  =  (Z i, . . . ,  Zt). 

The  relationship  between  targets  and  measurements 
is  described  via  a  set  of  association  variables.  The  as¬ 
sociation  variables  comprise  of  either  or  both  of  the 
following: 

1.  For  each  target  i  G  {1, . . . ,  nt},  an  association  vari¬ 
able  a\  G  {0, 1, . . .  ,mt},  the  value  of  which  is  an 
index  to  the  measurement  with  which  the  target  is 
hypothesised  to  be  associated  (zero  if  the  target  is 
hypothesised  to  have  not  been  detected) 

2.  For  each  measurement  j  G  {1, . . . ,  nit},  an  associa¬ 
tion  variable  bj  €  {0,1,...,  rz^},  the  value  of  which 
is  an  index  to  the  target  with  which  the  measure¬ 
ment  is  hypothesised  to  be  associated  (zero  if  the 
measurement  is  hypothesised  to  be  a  false  alarm) 

Note  that  the  two  sets  of  association  variables  are  en¬ 
tirely  redundant:  given  the  information  from  either  set, 
the  other  can  be  reconstructed  perfectly.  We  consider 
two  formulations;  the  first,  involving  only  target  associ¬ 
ation  variables  ( a j, . . . ,  a”*),  we  refer  to  as  a  target  ori¬ 
ented  formulation;  the  second,  which  incorporates  both 
sets  of  variables,  we  refer  to  as  a  hybrid  formulation. 
A  similar  measurement  oriented  formulation  has  been 
considered  but  is  omitted  here  due  to  space  limitations; 
it  is  closely  related  to  the  well-known  Probabilistic  Mul¬ 
tiple  Hypothesis  Tracker  (PMHT)  [7]. 

Through  straight-forward  and  well-known  manipula¬ 
tions,  the  joint  probability  of  measurements  and  associ¬ 
ations  conditioned  on  target  states  can  be  written  as:1 


p(Zt,at\Xt)  oc 


^(pd{x\)p{zf\x\)\ 

i= 1  V  -^fa  (zt*)  ) 

■  (1  -  Pd Ozj))1-^ 


M°t) 


4>c{at ) 


(1) 


where  at  =  ( a . . . ,  a™*),  0d{a\)  is  the  detection  flag: 


oM)  = 


0,  a\  =  0 


and  V’c(dt)  is  the  global  consistency  constraint 


V’c  (at)  = 


0,  3  i,j  G  {1, . ..  ,nt}  s.t.  a\  =  a{  ±  0 
1 ,  otherwise 


The  proportionality  in  Eq.  (1)  is  with  respect  to  all 
parameters  other  than  the  number  of  observations  and 
observation  values  (both  of  which  are  constants  at  run¬ 
time). 


2.2  Graphical  models 

Graphical  models  [8,  9,  10]  aim  to  represent  and  ma¬ 
nipulate  the  joint  probability  distributions  of  many 
variables  efficiently  by  exploiting  factorisation.  The 
Kalman  filter  [11]  and  the  hidden  Markov  model 
(HMM)  [12]  are  two  examples  of  algorithms  that  exploit 
sparsity  of  a  particular  kind  (i.e.,  a  Markov  chain)  to 
efficiently  conduct  inference  on  systems  involving  many 
probability  variables.  Inference  methods  based  on  the 
graphical  model  framework  generalise  these  algorithms 
to  a  wider  variety  of  state  spaces  and  dependency  struc¬ 
tures. 

Graphical  model  methods  have  been  developed  for 
undirected  graphical  models  (Markov  random  fields), 
directed  graphical  models  (Bayes  nets)  and  factor 
graphs.  We  prefer  factor  graphs  [13]  as  they  provide 
the  most  expressive  form  of  characterisation  of  the  de¬ 
pendency  structure.  In  the  general  case,  we  have  nodes 
(i.e.,  random  variables)  n  G  A f,  and  factors  (i.e.,  de¬ 
pendencies)  /  G  T ,  where  each  /  is  equipped  with  a  set 
of  neighbouring  nodes,  r//  C  J\f .  The  joint  distribution 
is  factored  as: 

P{xm)  OC  1pn{xn)  n  ^/(  Xr)f  ) 

neN  fer 

The  functions  ip.  (•)  collectively  represent  the  joint  prob¬ 
ability  distribution.  For  example,  a  Markov  chain  in¬ 
volving  variables  (aq, . . . ,  xn )  may  be  formulated  using 
a  factor  ip\{x\)  =  p(xi)  for  the  initial  prior,  and  factors 
ipk-i,k(xk-i,xk)  =  p(xk\xk-i),  k  G  {2, . . . ,  n}  repre¬ 
senting  the  Markov  transition  kernels,  although  many 
other  formulations  are  possible.  Optimal  inference  can 
be  conducted  on  tree-structured  factor  graphs  using  be¬ 
lief  propagation  (BP).  BP  proceeds  by  passing  messages 
between  nodes  and  factors.  We  denote  by  pn-tf{xn)  the 
message  sent  from  node  nG  ??/  to  factor  /,  by  p f^n(xn) 
the  message  sent  from  factor  /  to  node  n  G  rjf  and  by 
Vn  =  {/  G  T\n  G  /}  the  factors  involving  node  n.  The 
iterative  update  equations  are  then: 

Mn— > f(,Xn}  =  1pn(,Xn)  |  j  /t£_>n(xn)  (2) 

£6 fin\{/} 

Pf^n{xn)  =  /  IpfiXrjj,)  p^^f(x^) 

£e»)/\{n} 

(3) 

where  the  integral  in  Eq.  (3)  is  taken  over  the  appro¬ 
priate  measure,  i.e.,  a  counting  measure  for  discrete 
components,  and  a  Lebesgue  measure  for  continuous 
components.  At  convergence,  the  marginal  distribution 
at  a  node  n  can  be  calculated  as: 

p(xn)  OC  1pn(xn)  P^n{xn)  (4) 


1Abusing  notation,  we  set  f(za)°  =  1  even  when  za  is  unde¬ 
fined  (i.e.,  when  a  =  0). 


In  the  case  of  a  Markov  chain,  if  all  nodes  are  jointly 
Gaussian,  then  BP  is  equivalent  to  a  Kalman  smoother. 


Similarly,  if  all  nodes  are  discrete,  then  BP  is  equivalent 
to  inference  on  an  HMM  using  the  forwards-backwards 
algorithm.  BP  extends  each  of  these  algorithms  from 
chains  to  trees. 

BP  may  be  applied  to  loopy  graphs  (so-called  loopy 
belief  propagation).  Practically,  this  simply  involves  re¬ 
peated  application  of  Eqs.  (2)  and  (3)  until  convergence 
occurs  (i.e.,  until  the  maximum  error  between  subse¬ 
quent  messages  is  less  than  a  pre-set  threshold).  Un¬ 
fortunately,  this  is  neither  guaranteed  to  converge  to 
the  right  answer,  nor  to  converge  at  all,  although  re¬ 
markable  performance  has  been  demonstrated  in  var¬ 
ious  applications  [14].  Conceptually,  one  can  always 
convert  an  arbitrary  loopy  graph  to  a  tree  by  merging 
nodes  (so-called  junction  tree  representations  [8]),  but 
in  practical  problems,  the  dimensionality  of  the  agglom¬ 
erated  variables  may  be  prohibitive. 

BP  may  also  be  applied  to  different  probability  dis¬ 
tributions.  In  a  graph  involving  a  combination  of  dis¬ 
crete  and  Gaussian  nodes  and  factors,  all  continuous 
messages  will  assume  the  form  of  Gaussian  mixtures. 
Particle  BP  [15]  extends  particle  filtering  methods  from 
Markov  chains  to  general  graphical  models. 

3  Graphical  model  data  associa¬ 
tion 

In  this  section,  we  construct  graphical  model  formula¬ 
tions  of  the  data  association  problem  described  in  Sec¬ 
tion  2.1,  thus  permitting  us  to  solve  data  association 
problems  using  the  machinery  described  in  Section  2.2. 
The  formulations  are  shown  in  Fig.  1.  In  the  follow¬ 
ing  sections,  we  define  the  factors  and  write  equivalent 
forms  of  Eq.  (1)  which  respect  the  corresponding  graph 
structures. 

We  do  not  include  graph  nodes  for  measurements  in 
any  of  the  graphs  in  Fig.  1.  Since  they  are  known  at 
run-time,  they  do  not  need  to  be  explicitly  represented 
as  nodes.  Rather,  the  graph  directly  represents  the  con¬ 
ditional  dependency  structure,  conditioned  on  the  mea¬ 
surements,  and  the  measurements  appear  as  parameters 
in  the  factors  in  the  graph. 

3.1  Target  oriented 

First,  we  study  the  target  oriented  formulation,  shown 
in  Fig.  1(a).  The  constraint  that  each  target  gives 
rise  to  at  most  one  measurement  is  implicit  in  the  al¬ 
phabet  of  the  association  variables  a\.  The  constraint 
that  each  measurement  corresponds  to  at  most  one  tar¬ 
get  is  explicitly  enforced  by  the  consistency  constraints 
ifd (a),  a3).  Assuming  that  the  prior  distribution  of  Xt 
factorises  into  a  product  of  terms  in  each  target  x\ ,  we 
can  write  the  Bayes  update  using  Eq.  (1)  as 


nt 

p(Xt,at|Z4)  af] 

i=l 


(ipp(xl,alt)  V’dK, at)] 
V  j=i+ 1  / 


(5) 


C‘/.  i  (ttf  ■  ) 


ip<:\  (af  ap ) 


Vcl  [Ufat 


(a)  target-oriented  formulation 


(b)  hybrid  formulation 


Figure  1:  Target-oriented  and  hybrid  formulations  of 
the  data  association  problem  using  graphical  models. 


where 


i!>p(x\,a\) 


l/kiKX) 


[l-PdixiMxHZ*-1), 

Pd(x't)p(zatt\x,t)p(x\\Zt  *) 


0,  a\  —  a{  0 
1 ,  otherwise 


<4  =  o 
<4^o 


(6) 


(7) 


This  expression  respects  the  graph  structure  of 
Fig.  1(a),  i.e.,  the  expression  is  made  up  of  factors  that 
appear  in  the  graph,  involving  variables  to  which  the 
respective  factors  in  the  graph  are  connected. 

If  the  consistency  constraint  factors  V’ci (aj,<4)  are 
omitted  from  Fig.  1(a),  the  resulting  association 
weights  (i.e.,  the  marginal  probabilities  of  a\)  are  equiv¬ 
alent  to  the  weights  calculated  by  Probabilistic  Data 
Association  (PDA)  [16].  Armed  with  our  graphical  for¬ 
mulation,  the  JPDA  weights  can  be  calculated  by  ap¬ 
plying  classical  graph  inference  methods.  As  gating  will 
generally  restrict  each  target  to  associate  with  a  small 
number  of  measurements,  consistency  constraint  fac¬ 
tors  %fci(a\,  <4)  only  need  to  be  incorporated  between 
targets  (i,  j)  that  may  possibly  associate  with  the  same 
measurement.  The  resulting  sparsity  can  be  exploited 
by  the  junction  tree  method  [8]  to  calculate  the  ex¬ 
act  marginal  probabilities  of  the  association  weights  a\ 
in  the  presence  of  the  consistency  constraints.  This 
can  be  implemented  very  easily  using  freely  available 


libraries  such  as  libDAI  [17],  taking  as  an  input  the 
PDA  weights  and  the  constraint  tables  ipc\(a\,a3t)\  the 
result  is  quite  similar  to  the  fast  mutual  exclusion  al¬ 
gorithm  of  [18,  19].  The  disadvantage  of  the  method  is 
that  its  complexity  is  exponential  in  the  width  of  the 
junction  tree.  Consequently,  its  computational  com¬ 
plexity  is  problematic  when  many  targets  may  share 
observations. 

Alternatively,  and  somewhat  naively,  we  may  sim¬ 
ply  apply  loopy  BP.  As  we  will  see  (in  Section  4.1), 
the  performance  obtained  by  applying  this  approximate 
method  to  the  present  graph  is  somewhat  disappoint¬ 
ing.  This  is  not  surprising,  given  that  all  association 
variables  are  connected  to  each  other,  such  that  they 
form  a  large  clique.  The  hybrid  model  described  in  Sec¬ 
tion  3.2  obtains  much  better  performance  with  loopy 
BP. 


3.2  Hybrid 

The  hybrid  formulation  is  shown  in  Fig.  1(b).  In 
this  case,  the  constraints  that  each  measurement  cor¬ 
responds  to  at  most  one  target  and  each  target  gives 
rise  to  at  most  one  measurement  are  both  implicit  in 
the  alphabets  of  the  respective  association  variables, 
a\  and  bj .  The  constraint  factors  tp1^  (alt,  b{)  enforce 
consistency  of  the  representations,  i.e.,  that  the  redun¬ 
dant  association  variables  a\  and  bJt  describe  the  same 
association  configuration.  The  definition  of  the  factor 
%pp(x\,a\)  is  unchanged  from  Eq.  (6),  while  the  new 
constraint  factor  ip1^ (alt,  bj)  is 


M)  = 

so  that  we  can  write  Eq.  (1)  as 


0,  a\  =  j,  b3t^  i  or  bJt  =  i,  a\  ±  j 
1,  otherwise 


(8) 


3.3  Multiple  scans/sensors 

The  formulations  described  so  far  have  concentrated 
on  the  problem  that  exists  within  a  single  time  step. 
Extension  to  multiple  time  steps  and/or  multiple  sen¬ 
sors  simply  involves  a  replication  of  the  nodes  for  each 
time  step,  connecting  the  target  states  over  time  with 
factors  that  encode  the  Markov  transition  kernel.3  In 
many  tracking  problems,  the  resulting  graph  will  be 
mixed  discrete-Gaussian,  so  that  the  messages  sent  be¬ 
tween  continuous  nodes  (and  from  discrete  to  contin¬ 
uous)  are  in  the  form  of  Gaussian  mixtures.  In  our 
experiments,  we  approach  this  by  marginalising  over 
the  continuous  target  states.  When  marginalising  vari¬ 
ables  in  a  graphical  model,  we  must  connect  all  of  the 
variables’  neighbouring  factors  to  all  of  those  factors’ 
other  neighbours.  The  result  is  a  purely  discrete  graph 
which  contains  for  each  target  an  augmented  associa¬ 
tion  variable  a 1  =  (aj_a+1, . . .  ,a\)  that  hypothesises  a 
sequence  of  associations  over  all  s  scans  in  considera¬ 
tion.  These  augmented  association  variables  are  con¬ 
nected  to  sets  of  measurement  association  variables  for 
each  scan  (using  the  hybrid  formulation  of  Section  3.2). 
In  a  sense  this  is  equivalent  to  using  Gaussian  mixture 
messages  (as  there  is  a  bijection  between  augmented 
target  association  variable  values  and  Gaussian  mixture 
components),  but  options  for  message  simplification  are 
somewhat  more  limited  (essentially,  only  pruning  may 
be  used) .  Experiments  with  formulations  that  either  di¬ 
rectly  use  Gaussian  mixture  messages,  or  so-called  weak 
marginalisation  [10]  are  a  subject  of  future  work. 

Max-product  BP  may  also  be  performed  on  this 
graph,  resulting  in  an  approximate  algorithm  for  multi¬ 
dimensional  assignment.  This  is  similar  in  concept  to 
[21],  although  the  formulation  therein  assumes  an  ad¬ 
ditive  decomposition  of  association  weights  (as  a  sum 
of  pairwise  distances),  rather  than  the  standard  associ¬ 
ation  model  that  we  utilise. 


nt 

p(Xt,at|Z4)  cx[| 

i= 1 


^ipp(xl,alt)  fj  V’cJK,bt)j 


(9) 


When  an  optimal  inference  algorithm  is  applied  to  this 
model,  the  result  will  be  identical  to  the  exact  result  ob¬ 
tained  from  the  target  oriented  model;  the  formulation 
enforces  constraints  in  a  different  (but  equivalent)  man¬ 
ner.  However,  as  we  will  see,  approximate  algorithms 
may  obtain  quite  different  results. 

The  recent  work  [20]  proves  that  loopy  max-product 
BP  applied  to  a  minor  variation  of  the  hybrid  graph  in 
Fig.  1(b)  is  guaranteed  to  converge  to  the  MAP  assign¬ 
ment.2  While  this  guarantee  does  not  extend  to  the 
use  of  loopy  sum-product  BP  in  finding  marginal  asso¬ 
ciation  weights,  the  experiments  detailed  in  Section  4.1 
demonstrate  its  remarkable  performance  when  applied 
to  the  hybrid  formulation. 

2assuming  uniqueness  of  the  optimum. 


4  Experiments 

We  consider  a  scenario  in  which  a  group  of  nt  tar¬ 
gets  travel  in  grid  formation  over  a  2D  planar  field 
with  equidistant  spacing  between  adjacent  targets.  The 
state  dynamics  and  target-derived  measurement  equa¬ 
tions  are 


=  F^-r 


A  =  h*! 


where  the  state  of  each  target  x\,  consists  of  position 
and  velocity  in  two  dimensions,  w\  ~  A/”{0,  Q}  and 
v\  ~  A/”{0,  R}  are  i.i.d.  dynamics  and  measurement 
noise  processes  respectively,  H  =  [1  0]  ®  I2,  R  =  I2, 


F  = 


T 

1 


)  I2  and  Q  =  q 


T 

T2/  2 


T2/  2 
T3/ 3 


3For  the  multiple  sensor  case,  sets  of  association  nodes  for 
each  sensor  are  connected  to  the  same  set  of  target  state  nodes. 
The  target  state  prior  p(xl\Zt— 1)  that  is  incorporated  into 
Eq.  (6)  is  only  included  in  the  ^p(as J,  a\)  factor  for  the  first  time 
step/sensor. 


The  sample  period  T  =  1  sec,  q  =  10-2,  and  I2  is  a 
2x2  identity  matrix.  Targets  are  detected  with  prob¬ 
ability  Pd  =  0.6  (unless  otherwise  stated),  and  false 
alarms  are  distributed  uniformly  in  a  100  x  100  region 
(which  covers  all  targets),  arriving  according  to  a  Pois¬ 
son  process  where  the  expected  total  number  is  equal 
to  0.1  (unless  otherwise  stated).4  State  estimates  and 
covariances  are  initialised  with  values  consistent  with 
those  that  would  be  obtained  through  a  random  pro¬ 
cess  involving  five  measurements  at  consecutive  time 
steps  with  no  association  uncertainty.  For  each  exper¬ 
iment,  1000  Monte  Carlo  runs  were  performed.  The 
libDAI  [17]  implementation  of  loopy  BP  was  used  with 
parallel  updates,  a  convergence  threshold  of  10-6  and 
a  maximum  of  1000  iterations. 

4.1  Single  scan 

4.1.1  Marginal  association  probability  errors 

We  first  consider  a  scenario  involving  six  targets4  in  a 
regular  2x3  grid.  We  vary  the  spacing  between  the 
targets  and  examine  the  behaviour  of  our  approximate 
algorithms,  comparing  the  marginal  association  weights 
calculated  to  the  exact  JPDA  weights.  We  examine  the 
average  maximum  association  weight  error  per  target 
for  the  target-oriented  formulation  (Section  3.1)  and 
the  hybrid  formulation  (Section  3.2).  We  also  show 
the  error  in  the  PDA  weights  (i.e.,  the  weights  calcu¬ 
lated  ignoring  consistency  constraints)  for  reference,  as 
a  gauge  of  the  degree  of  interaction  between  targets, 
and  hence  of  the  difficulty  of  the  scenario. 

Results  of  experiments  with  probability  of  detection 
Pd  =  0.3,  0.6  and  0.9  are  shown  in  Fig.  2(a-c,f-h,k-m). 
The  plots  of  the  average  maximum  association  proba¬ 
bility  error  per  target  in  (a-c)  demonstrate  that,  while 
the  target-oriented  formulation  of  Section  3.1  (dashed 
line)  commits  errors  of  the  same  order  of  magnitude  as 
PDA  (dot-dashed  line),  the  errors  in  the  weights  cal¬ 
culated  by  the  hybrid  formulation  of  Section  3.2  (solid 
line)  are  quite  small.  For  Pd  =  0.9  the  errors  are  rea¬ 
sonably  small  (<  0.083),  but  for  lower  Pd,  they  are 
remarkably  small  (<  0.026  for  Pd  =  0.6,  and  <  0.006 
for  Pd  =  0.3).  The  results  of  experiments  with  an  in¬ 
creased  false  alarm  rate  (from  Afa  =  0.1  to  Afa  =  1 
and  Afa  =  2)  with  Pd  =  0.9  are  shown  in  Fig.  2(d- 
e,i-j,n-o).  The  results  show  that  the  accuracy  of  the 
approximate  weights  calculated  by  the  hybrid  formula¬ 
tion  improves  as  the  false  alarm  rate  increases  (error 
<  0.072  for  Afa  =  1  and  <  0.068  for  Afa  =  2). 

One  factor  which  acts  to  reduce  coupling  in  the  graph 
(and  hence  reduce  errors)  is  the  “missed  detection” 
events,  under  which  there  is  no  competition  between 
targets  for  measurements.  These  events  are  more  prob¬ 
able  when  Pd  is  lower,  and  when  Afa  is  higher.  Ac¬ 
cordingly,  as  demonstrated  in  the  results,  the  BP-based 

4This  relatively  small  number  was  chosen  to  permit  calcula¬ 
tion  of  the  exact  weights  (for  comparison)  in  reasonable  time. 


algorithm  can  be  expected  to  produce  better  approxi¬ 
mations  of  association  weights  in  challenging  tracking 
environments,  i.e.,  those  involving  lower  probability  of 
detection  and/or  higher  false  alarm  rates. 

The  plots  in  Fig.  2(f-j)  show  the  percentage  of  cases 
in  which  loopy  BP  converges,  while  (k-o)  show  the  av¬ 
erage  number  of  iterations  (excluding  non-convergent 
cases).  The  results  show  that,  while  the  target-oriented 
formulation  exhibits  significant  difficulties  with  non¬ 
convergence,  the  hybrid  formulation  converges  very  re¬ 
liably  (100%  and  >  97.6%  of  trials)  for  Pd  =  0.3  and 
0.6.  While  convergence  is  less  reliable  (>  46.4%  of  tri¬ 
als)  for  Pd  =  0.9,  the  average  error  differs  little  between 
cases  that  converge  and  those  that  do  not,  so  the  oscil¬ 
lation  that  is  occurring  would  appear  to  be  between  rea¬ 
sonably  good  solutions.  With  higher  false  alarm  rates 
(and  Pd  =  0.9),  the  hybrid  formulation  converges  very 
reliably  (>  99.6%  and  100%  of  trials  for  Afa  =  1  and 
Afa  =  2  respectively).  No  attempt  was  made  to  employ 
BP  convergence  aids  such  as  damped  updates. 

Given  the  significant  performance  advantage  of  the 
hybrid  formulation  over  the  target-oriented  formula¬ 
tion,  the  subsequent  experiments  employ  it  exclusively. 
The  marked  difference  in  the  performance  of  the  target 
oriented  formulation  and  the  hybrid  formulation  raises 
the  question  of  whether  there  may  be  structures  in¬ 
volving  further  redundancy  that  provide  additional  in¬ 
creases  in  performance;  this  is  a  topic  of  future  study. 

4.1.2  Scalability  of  loopy  BP  method 

Our  second  experiment  examines  the  scalability  of  our 
proposed  method  by  calculating  association  weights  for 
a  square  grid  of  between  2x2  and  10  x  10  targets  (i.e., 
between  four  and  100  targets),  with  a  spacing  of  four 
units.  Gating  is  applied  conservatively  such  that  a  fac¬ 
tor  is  not  incorporated  between  a  target  and  a  measure¬ 
ment  if  the  PDA  weight  is  less  than  ICC6  (the  alphabet 
of  the  measurement  association  variable  is  constrained 
accordingly).  In  the  10  x  10  target  case,  the  maximum 
number  of  measurements  to  which  a  target  is  connected 
is  11.3  on  average  (averaging  over  the  1000  Monte  Carlo 
trials),  and  the  targets  are  connected  to  6.3  measure¬ 
ments  on  average  (averaging  over  targets  and  Monte 
Carlo  trials).  This  level  of  connectivity  across  a  dense 
grid  is  unable  to  be  addressed  by  any  exact  method. 

Fig.  2(p)  shows  the  average  True  Measurement  Prob¬ 
ability  (TMP),  i.e.,  the  marginal  weight  of  the  correct 
association  for  the  target,  for  different  problem  sizes 
(solid  line) ,  and  the  average  number  of  iterations  to  con¬ 
vergence  (solid  line  with  crosses).  The  diagram  shows 
that  the  quality  of  the  solution  suffers  little  degrada¬ 
tion  as  the  problem  size  increases,  and  that  the  num¬ 
ber  of  iterations  required  grows  at  a  relatively  slow  rate. 
Asymptotically,  we  expect  that  this  will  level  out  to  a 
constant  value  as  the  number  of  targets  exceeds  the 
graph  connectivity,  e.g.,  since  associations  of  distant 


targets  are  of  little  consequence.  Loopy  BP  converged 
in  all  trials  and  all  network  sizes. 

The  average  run  time  for  the  100  target  scenario 
was  19.4  sec.  Without  gating,  there  are  nt  x  mt.  fac¬ 
tors,  each  of  which  involves  computation  (when  imple¬ 
mented  using  a  general-purpose  inference  library)  of 
(nt  +  1)  x  (mt  +  1)  elements.  Our  naive  implemen¬ 
tation  with  conservative  gating  involves  (on  average) 
793  factors,  each  of  which  involves  (on  average)  over 
4900  elements.  The  number  of  factors  can  be  reduced 
by  using  a  less  conservative  gating  strategy  (10-3  is  a 
more  common  threshold).  The  structure  of  the  con¬ 
straint  matrices  may  also  be  exploited  to  obtain  faster 
run  times;  specifically,  each  target-measurement  and 
measurement-target  message  involves  only  two  differ¬ 
ent  values.  This  is  not  exploited  in  our  implementa¬ 
tion,  which  utilises  the  general-purpose  libDAI  library, 
such  that  all  values  are  calculated  independently.  We 
expect  that  an  optimised,  purpose  specific  implementa¬ 
tion  could  speed  computation  by  two  orders  of  magni¬ 
tude  on  the  same  hardware  and  using  the  same  conser¬ 
vative  gating  strategy.  In  an  optimised  implementation 
(neglecting  gating),  each  iteration  of  target  association 
variable  to  measurement  association  variable  message 
calculations  will  require  (in  total)  2 ntm^  floating  point 
operations,  while  each  iteration  of  measurement  asso¬ 
ciation  variable  to  target  association  variable  message 
calculations  will  require  2n^mt  operations.  In  addition 
to  these  optimisations,  the  message  passing  structure 
of  BP  lends  itself  to  parallelisation,  allowing  for  easy 
exploitation  of  multi-core  architectures  ( e.g .,  [22]). 

4.2  Multiple  scans 

4.2.1  Marginal  association  weights 

The  experiment  setup  described  in  Section  4  was  also 
applied  to  scenarios  involving  two,  three  and  four  scans 
of  measurements  (with  six  targets  in  a  2  x  3  grid, 
spaced  by  four  units).  We  apply  loopy  BP  to  the  en¬ 
tire  graph  (spanning  multiple  time  steps)  as  described 
in  Section  3.3,  comparing  results  to  JPDA,5  in  which 
marginal  weights  for  each  target  are  calculated  exactly 
for  each  time  step,  and  then  propagated  forward  ap¬ 
proximating  the  past  joint  association  weights  as  the 
product  of  the  marginal  weights  for  each  target.  The 
computational  complexity  of  an  optimised  implemen¬ 
tation  of  this  is  approximately  ntrih  Y^k=t-s+i  mt  anc^ 
nt  Sfc=t-s+l  mt  floating  point  operations  (in  total)  for 
each  alternating  iteration,  where  n-h  is  the  number  of 
(single  target)  association  hypotheses  per  target.  We 
cannot  solve  for  the  optimal  augmented  (over  time) 

5Whereas  standard  JPDA  approximates  the  posterior  as  a 
unimodal  Gaussian,  we  retain  the  full  Gaussian  mixture  for  each 
target  with  the  corresponding  weights.  Mixture  reduction  is  ob¬ 
viously  necessary  in  practice;  in  this  work  we  retain  the  full  rep¬ 
resentation  in  order  to  concentrate  on  the  problem  at  hand,  i.e., 
approximate  inference  of  association  weights. 


weights  in  a  problem  of  this  size.  BP  failed  to  con¬ 
verge  in  only  0,  1,  1  and  3  cases  out  of  1000  in  the  1, 
2,  3  and  4  scan  experiments  respectively.  The  average 
number  of  iterations  required  was  24,  27,  31  and  28 
respectively  (excluding  non-convergent  cases). 

Our  results  are  shown  in  Fig.  2(q-t).  The  x  and  y 
axes  plot  TMP  in  the  final  time  step  for  loopy  BP  and 
JPDA  respectively.  Intuitively,  we  expect  that  higher 
TMP  values  represent  better  performance.  Points  to 
the  right  of/below  the  x  =  y  line  on  the  plots  represent 
cases  in  which  loopy  BP  yields  a  better  result,  while 
points  above/to  the  left  of  the  x  =  y  line  represent 
cases  in  which  JPDA  yields  a  better  result.  The  plots 
for  the  single  scan  case  (Fig.  2(q))  reveal  more  about 
the  nature  of  the  errors  committed  by  loopy  BP  in  the 
experiments  in  the  previous  section.  In  this  case,  the 
JPDA  weights  represent  the  exact  marginal  weights  for 
the  scenario.  In  comparison,  the  BP  result  tends  to 
apply  higher  weight  to  the  true  measurement  when  its 
weight  is  higher,  and  lower  weight  to  the  true  measure¬ 
ment  when  its  weight  is  lower.  The  overall  average  error 
is  consistent  with  the  result  shown  in  Fig.  2(b).  In  the 
2,  3  and  4  scan  experiments  in  Fig.  2(r-t),  there  are 
progressively  fewer  cases  in  which  BP  assigns  a  higher 
weight  to  the  true  measurement,  and  a  small  number 
of  cases  appear  in  which  BP  assigns  very  low  weight 
(e.g.,  ss  0.01)  to  the  true  hypothesis  when  JPDA  as¬ 
signs  a  greater  value  (e.g.,  «  0.1  —  0.2).  Consequently, 
although  BP  yields  a  slightly  higher  (<  1%)  mean  TMP 
in  all  experiments,  it  generally  exhibits  a  slightly  lower 
(<  0.8%)  mean  log  TMP,  which  is  arguably  the  correct 
measure  for  comparison.  The  erroneously  low  weights 
are  of  practical  importance  as  they  would  likely  lead  to 
pruning  of  the  correct  hypothesis  in  a  small  proportion 
of  cases. 

It  is  plausible  that  BP  might  be  able  to  provide  bet¬ 
ter  performance  than  JPDA  over  multiple  time  steps 
as  it  is  able  to  implicitly  represent  correlation  between 
target  association  events  in  earlier  time  steps.  This  im¬ 
provement  is  not  evident  in  this  experiment.  Rather  it 
seems  that  the  strong  loops  that  arise  due  to  coupling 
of  time  steps  causes  occasional  anomalous  behaviour. 
Future  work  is  required  on  this  topic  to  examine  the 
causes  of  these  effects,  identify  cases  in  which  problems 
are  likely  to  occur,  and  develop  alternative  methods  for 
application  in  these  cases. 

4.2.2  Multi-dimensional  assignment 

As  mentioned  in  Section  3.3,  the  max-product  variant 
of  loopy  BP  can  also  be  applied  to  find  an  approximate 
MAP  solution,  i.e.,  an  approximate  solution  to  multi¬ 
dimensional  assignment  problems.  To  demonstrate  the 
potential  of  the  algorithm,  we  applied  max-product  BP 
to  the  problem  described  in  the  previous  section,  and 
compared  to  the  optimal  solution  obtained  using  the 
Matlab  Optimization  Toolbox.  The  results  for  10,000 
Monte  Carlo  trials  are  summarised  in  Table  1. 


#  Scans 

Conv 

Feas 

Optimal 

#  Iter 

1 

9995 

10000 

10000 

9 

2 

9926 

9938 

9929 

10 

3 

9941 

9953 

9935 

10 

4 

9942 

9953 

9937 

11 

Table  1:  Results  of  loopy  max-product  BP  applied 
to  multi-dimensional  assignment  problems.  Columns 
show  number  of  scans  (i.  e.,  number  of  dimensions  in 
assignment  problem  minus  1),  number  of  Monte  Carlo 
trials  (out  of  10,  000)  in  which  BP  converged  (we  ex¬ 
pect  that  all  simulations  in  the  one-scan  case  would 
have  converged  if  iteration  was  permitted  to  continue), 
number  of  trials  in  which  BP  produced  a  feasible  an¬ 
swer  (i.e.,  one  which  obeys  the  consistency  constraints), 
number  of  trials  in  which  BP  produced  an  optimal  so¬ 
lution  [i.e.,  an  objective  within  10-6  of  the  integer  pro¬ 
gramming  solution),  and  average  number  of  iterations 
required  for  BP  to  converge  (excluding  non-convergent 
cases). 

These  results  demonstrate  the  potential  of  loopy  BP  as 
a  viable  alternative  solution  for  multi-dimensional  as¬ 
signment  problems.  For  example,  even  in  a  four  scan 
(5D)  assignment  problem,  the  optimal  solution  was 
found  in  99.4%  of  cases.  Our  future  work  on  this  prob¬ 
lem  includes  testing  the  formulation  in  more  stressing 
scenarios,  and  applying  Tree  Re- weighted  Max  Product 
(TRMP)  [23],  which  can  improve  performance  over  max 
product  BP,  and  provide  a  “certificate  of  optimality”  in 
some  circumstances. 

5  Discussion 

The  experiments  presented  in  Section  4  demonstrate 
the  remarkable  performance  of  the  hybrid  graphical 
model  formulation  of  Section  3.2  for  data  association 
problems.  The  experiments  were  designed  to  provide 
challenging  conditions  involving  a  range  of  interaction 
strengths  and  a  large  number  of  targets.  The  perfor¬ 
mance  of  the  graphical  model-based  method  is  clearly 
impressive,  especially  in  challenging  environments,  i.e., 
problems  with  low  Pd  and/or  high  false  alarm  rates. 
This  performance,  coupled  with  the  low  order  polyno¬ 
mial  complexity  of  the  method,  demonstrate  that  loopy 
BP  is  a  very  attractive  method  for  both  calculation  of 
marginal  association  weights  and  multi-dimensional  as¬ 
signment. 

Our  future  work  includes  comparing  to  existing  ap¬ 
proximate  methods,  incorporating  varying  and  uncer¬ 
tain  target  counts,  and  employing  and/or  developing 
advanced  inference  methods  to  address  the  small  pro¬ 
portion  of  cases  where  performance  is  problematic. 
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Figure  2:  Results  of  computational  experiments,  (a-o)  show  the  average  maximum  association  weight  error  per 
target,  the  average  number  of  iterations  and  percentage  of  cases  in  which  loopy  BP  converges  for  the  experiments 
in  Section  4.1.1.  The  solid  line  shows  the  loopy  BP  hybrid  formulation,  the  dashed  line  shows  the  loopy  BP  target- 
oriented  formulation,  while  the  dot-dashed  line  shows  the  PDA  weights  (neglecting  consistency  constraints),  (p) 
shows  the  average  marginal  probability  of  the  true  measurement  (or  True  Measurement  Probability,  TMP)  with 
an  unmarked  line  and  average  number  of  iterations  with  a  cross  marked  line  for  the  scalability  experiments 
(Section  4.1.2).  (q-t)  show  TMP  for  loopy  BP  ( x  axis)  and  JPDA  ( y  axis)  for  experiments  involving  1,  2,  3  and 
4  scans  of  measurements  (Section  4.2.1). 


